On the Effects of Data Heterogeneity on the Convergence Rates of Distributed Linear System Solvers
We consider the fundamental problem of solving a large-scale system of linear
equations. In particular, we consider the setting where a taskmaster intends to
solve the system in a distributed/federated fashion with the help of a set of
machines, who each have a subset of the equations. Although there exist several
approaches for solving this problem, missing is a rigorous comparison between
the convergence rates of the projection-based methods and those of the
optimization-based ones. In this paper, we analyze and compare these two
classes of algorithms with a particular focus on the most efficient method from
each class, namely, the recently proposed Accelerated Projection-Based
Consensus (APC) and the Distributed Heavy-Ball Method (D-HBM). To this end, we
first propose a geometric notion of data heterogeneity called angular
heterogeneity and discuss its generality. Using this notion, we bound and
compare the convergence rates of the studied algorithms and capture the effects
of both cross-machine and local data heterogeneity on these quantities. Our
analysis results in a number of novel insights besides showing that APC is the
most efficient method in realistic scenarios where there is a large data
heterogeneity. Our numerical analyses validate our theoretical results.
Comment: 11 pages, 5 figures
Connectivity-Aware Semi-Decentralized Federated Learning over Time-Varying D2D Networks
Semi-decentralized federated learning blends the conventional
device-to-server (D2S) interaction structure of federated model training with
localized device-to-device (D2D) communications. We study this architecture
over practical edge networks with multiple D2D clusters modeled as time-varying
and directed communication graphs. Our investigation results in an algorithm
that controls the fundamental trade-off between (a) the rate of convergence of
the model training process towards the global optimizer, and (b) the number of
D2S transmissions required for global aggregation. Specifically, in our
semi-decentralized methodology, D2D consensus updates are injected into the
federated averaging framework based on column-stochastic weight matrices that
encapsulate the connectivity within the clusters. To arrive at our algorithm,
we show how the expected optimality gap in the current global model depends on
the greatest two singular values of the weighted adjacency matrices (and hence
on the densities) of the D2D clusters. We then derive tight bounds on these
singular values in terms of the node degrees of the D2D clusters, and we use
the resulting expressions to design a threshold on the number of clients
required to participate in any given global aggregation round so as to ensure a
desired convergence rate. Simulations performed on real-world datasets reveal
that our connectivity-aware algorithm significantly reduces the total
communication cost required to reach a target accuracy compared with baselines,
with the size of the reduction depending on the connectivity structure and the
learning task.
Comment: 10 pages, 5 figures. This paper has been accepted to ACM-MobiHoc 202
Mathematical Tools and Convergence Results for Dynamics over Networks
Mathematical models of networked dynamical systems are ubiquitous - they are used to study power grids, networks of webpages, robotic and sensor networks, and social networks, to name a few. Importantly, most real-world networks are time-varying and are affected by stochastic phenomena such as adversarial attacks and communication link failures. Time-varying networks, therefore, have been under study for a few decades. However, our current understanding of the dynamical processes evolving over such networks is limited. This observation motivates the two-pronged objective of this dissertation: (i) to use theoretical and empirical methods to analyze certain networked dynamical systems that cannot be studied using standard tools and techniques, and (ii) to develop suitable mathematical techniques for the systematic study of such systems. As our main contribution resulting from (i), we use the properties of random time-varying networks to provide a rigorous theoretical foundation for the age-structured Susceptible-Infected-Recovered model, a model of epidemic spreading. We then use system identification to show that the age-structured SIR dynamics accurately model the spread of COVID-19 at city and state levels in two different parts of the world – Tokyo and California. As for our contributions resulting from (ii), we extend two assertions of the Perron-Frobenius theorem to time-varying networks described by strongly aperiodic stochastic chains, thereby widening the applicability of the fundamental result that is foundational to probability theory and to the studies of complex networks, population dynamics, internet search engines, etc. Our results enable us to extend several known results on distributed learning and averaging. Moreover, they promise to advance our understanding of dynamical processes over real-world networks. As an application of these results, we study non-Bayesian social learning on random time-varying networks that violate the standard connectivity criterion of uniform strong connectivity. In doing so, we also make a methodological contribution: we show how the theory of absolute probability sequences and martingale theory can be combined to analyze nonlinear dynamics that approximate linear dynamics asymptotically in time. Finally, we study the convergence properties of social Hegselmann-Krause dynamics (which is a variant of the classical Hegselmann-Krause model of opinion dynamics and incorporates state-dependence into distributed averaging). As our main contribution here, we provide nearly tight necessary and sufficient conditions for a given connectivity graph to exhibit unbounded epsilon-convergence times for such dynamics.